Quest DMV Breakfast Code – Disk IO

Comments 0

Share to social media

For any of you who attended the Quest breakfast at PASS (and anyone else really) I wanted to go ahead and post the code and give a little wrap up/explanation. (I will post the stuff on perf counters later. They are cool, but this is really what I wanted to make sure people saw.)

I had a plan for my presentation, but time was less than I expected, and the good stuff kinda fell off. So here is the stuff around the Disk IO DMV section that I really messed up and didn’t show enough. I read a few of the comments that my presentation wasn’t that useful to you, and that is clearly on me.  If the index ones are useful, these are too, and can be used to help spread load around your disks.

So, here is the slide I presented:

 image

By knowing the amount of data written to a database file, you can see how busy the file is, the filegroup is, the database, the disk drive, etc, by aggregating the data in the following query in various way.  The query is pretty simple and will return all files in all databases (there are parms on the function to change that, but check BOL for more info):

–since reboot
SELECT db_name(mf.database_id) as databaseName,
            mf.physical_name, divfs.num_of_reads,
            divfs.num_of_bytes_read, divfs.io_stall_read_ms, 
            divfs.num_of_writes, divfs.num_of_bytes_written,
            divfs.io_stall_write_ms,
            divfs.io_stall,size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(null,null) as divfs
            JOIN sys.master_files as mf
                ON mf.database_id = divfs.database_id
                     and mf.file_id = divfs.file_id

 

The “stall” columns give you how long any processes have waited on your disk drives to serve up data for this file.  This information plus the information in _bytes_written and _bytes_read columns can give you how much data was moved, and how long your CPU had to wait for the data before it could get working on it.  Very powerful stuff, and can really give you a look at opportunities for performance tuning.

That query gives you data from the last reboot of the server, but usually you want to see up to date values. So you have to get a baseline of values, then compare them at a later point.  You can build tools that do this, and you can use the performance data warehouse in 2008, and I will demonstrate the former.  Using a temp table and a WAITFOR DELAY command, I build a quick tool to see waits and stats over 20 seconds. Note that I use a #temp table to allow you to run the last query over and over to see the comparison to the baseline until you want to drop the #temp table.

–over the following delay
declare @delay varchar(10) = ’00:00:20′

SELECT db_name(mf.database_id) as databaseName,
       mf.physical_name, divfs.num_of_reads,
       divfs.num_of_bytes_read, divfs.io_stall_read_ms,
       divfs.num_of_writes, divfs.num_of_bytes_written,
       divfs.io_stall_write_ms,
       divfs.io_stall,size_on_disk_bytes,
       getdate() as baselineDate
INTO #baseline
FROM sys.dm_io_virtual_file_stats(null,null) as divfs
            JOIN sys.master_files as mf
                ON mf.database_id = divfs.database_id
                     and mf.file_id = divfs.file_id

WAITFOR DELAY @delay

;WITH currentLine as (  
    SELECT  db_name(mf.database_id) as databaseName,  
            mf.physical_name,num_of_reads, num_of_bytes_read,
            io_stall_read_ms, num_of_writes,
            num_of_bytes_written, io_stall_write_ms,
            io_stall,size_on_disk_bytes,
                getdate() as currentlineDate
     FROM sys.dm_io_virtual_file_stats(null,null) as divfs
                JOIN sys.master_files as mf
                        ON mf.database_id = divfs.database_id
                            and mf.file_id = divfs.file_id)

SELECT  DATEDIFF(s,#baseline.baselineDate,currentlineDate),
        currentLine.databaseName,
      left(currentLine.physical_name,1) as drive,
      currentLine.physical_name
     ,currentLine.io_stall – #baseline.io_stall as io_stall_ms
     ,currentLine.io_stall_read_ms –
               #baseline.io_stall_read_ms as io_stall_read_ms
     ,currentLine.io_stall_write_ms –
              #baseline.io_stall_write_ms as io_stall_write_ms
     ,currentLine.num_of_reads – #baseline.num_of_reads as
                                               num_of_reads

    ,currentLine.num_of_writes – #baseline.num_of_writes
                                            as num_of_writes
   ,currentLine.num_of_bytes_written –
        #baseline.num_of_bytes_written as num_of_bytes_written
FROM currentLine
      join #baseline
          on #baseLine.databaseName = currentLine.databaseName
             and #baseLine.physical_name =
                              currentLine.physical_name
go
–drop table  #baseline

The most interesting part of this query is often the stall columns.  When I see really bad IO, the number of seconds waited can end up being way more than the actual number of seconds. Disk IO is more or less serial (particularly on my mini HP with Win7 Starter Edition), so if you have many queries running concurrently, every one of them needs data from the same IO source, and if one query used the disk for 1 second, the other processes might wait for 1 second each. When wait stats are really low, that means the disk were ready, willing, and able to serve all processes.

On your test server, you can use a query like:

–create database DMV
–go
use DMV
go
create table testDMV
(
    testDMVid int identity,
    bigun     char(8000)
)
go
set nocount on
insert into testDMV(bigun)
select REPLICATE(‘a’,8000)
go 5000

–drop table testDmv

select *
from   testDMV

 

To create some data.  Note that data will be written to the disk in various counts.  I set all of the rows to take a full data page, and each of the rows is an individual transaction.  But it won’t be a 1-1 match with the number of rows written or read. SQL Server flushes data to the disk in the manner it wants to, and some manner of caching will change the number over time.  Run the select * from testDMV query over and over and you will not see any change in the disk IO, because it is in cache now.

I hope this is better. Sometimes as a presenter I am a good writer, and frankly the lack of bacon really threw me off this morning. Feel free to email me at louis@drsql.org or tweet me at drsql and I will be happy to share more.  All of this will be in our RedGate book that will be coming out before the end of the millennium, at the very minimum.

Load comments

About the author

Louis Davidson

Simple Talk Editor

See Profile

Louis is the editor of this Simple-Talk website. Prior to that, has was a corporate database developer and data architect for a non-profit organization for 25 years! Louis has been a Microsoft MVP since 2004, and is the author of a series of SQL Server Database Design books, most recently Pro SQL Server Relational Database Design and Implementation.